Skip to content

fix: prevent stale prompt cache session overwrites#372

Closed
ndycode wants to merge 1 commit intomainfrom
fix/prompt-cache-races
Closed

fix: prevent stale prompt cache session overwrites#372
ndycode wants to merge 1 commit intomainfrom
fix/prompt-cache-races

Conversation

@ndycode
Copy link
Copy Markdown
Owner

@ndycode ndycode commented Apr 6, 2026

Problem

Overlapping requests that share a prompt-cache session can finish out of order.

Before this change, older responses were allowed to overwrite newer session-affinity state, which meant stale previous_response_id values and stale preferred-account affinity could win after a newer request had already completed.

Fix

Version session-affinity writes per request and reject stale writes.

This ensures the newest in-flight request for a session keeps control of both:

  • the stored previous_response_id
  • the preferred account affinity

Changes

  • lib/session-affinity.ts
    • add per-entry writeVersion
    • add rememberWithVersion(...)
    • ignore stale rememberWithVersion(...) and updateLastResponseId(...) calls
  • index.ts
    • assign a monotonic per-request session-affinity write version
    • use the versioned session-affinity writes in the response-id callback
  • test/session-affinity.test.ts
    • add regression coverage for stale overlapping response-id writes being ignored
  • test/index.test.ts
    • update overlapping same-session stream regression to assert the newest request wins

Validation

npx vitest run test/session-affinity.test.ts test/index.test.ts
Test Files  2 passed (2)
Tests      131 passed (131)

npx vitest run
Test Files  222 passed (222)
Tests      3314 passed (3314)

note: greptile review for oc-chatgpt-multi-auth. cite files like lib/foo.ts:123. confirm regression tests + windows concurrency/token redaction coverage.

Greptile Summary

this PR fixes stale prompt-cache session overwrites by adding a per-request monotonic writeVersion to session-affinity entries and rejecting writes whose version is older than the stored one. the core streaming mechanism is sound and the new regression test validates the primary overlapping-stream scenario.

  • correct: rememberWithVersion and updateLastResponseId both guard on entry.writeVersion > writeVersion, preventing out-of-order responses from clobbering newer state
  • correct: sessionAffinityWriteVersion at module scope keeps the counter monotonic across plugin config reloads — a reset to 0 would break stale-write prevention
  • correct: the test/index.test.ts overlapping-stream test fires callbacks out-of-order and asserts the newest account and previous_response_id win
  • concurrency issue: remember() passes Date.now() (~1.7e12) as writeVersion, while onResponseId uses integer versions (1, 2, 3 …). for non-streaming requests where responseContinuationEnabled=true and onResponseId never fires, the fallback remember() on the success path writes a timestamp-scale version and permanently blocks all subsequent integer-versioned rememberWithVersion calls for that session
  • missing coverage: no vitest test exercises the timestamp-vs-integer version contamination path; the new test at line 147 only covers integer-vs-integer comparisons

Confidence Score: 3/5

safe to merge for the streaming path; non-streaming requests with response continuation enabled can permanently freeze session affinity after one response-ID-less response

the primary overlapping-stream bug is correctly fixed and well-tested. however, the mixed integer/timestamp version space is an existing concurrency issue not fully addressed by this PR — one non-streaming response per affected session can permanently poison its versioning, and there is no test covering this path.

lib/session-affinity.ts and the fallback remember() call sites in index.ts (success path around line 2397) need attention for the version-space contamination fix

Important Files Changed

Filename Overview
lib/session-affinity.ts adds writeVersion field and rememberWithVersion/updateLastResponseId versioning — correct for integer-vs-integer comparisons but the numeric field is shared with timestamp-based writes from remember(), creating a mixed version space that can permanently freeze affinity updates
index.ts assigns monotonic sessionAffinityVersion and threads it through onResponseId correctly; fallback remember() calls on success path still use Date.now() as version rather than the per-request integer, creating the contamination path; module-scope declaration is intentional but undocumented
test/session-affinity.test.ts new regression test covers integer-vs-integer stale rejection correctly; missing coverage for timestamp-version contamination from fallback remember() path
test/index.test.ts overlapping same-session stream test updated to fire callbacks out of order and assert newest account+response-id wins — scenario and assertions are correct

Sequence Diagram

sequenceDiagram
    participant R1 as Request 1 (v=1)
    participant R2 as Request 2 (v=2)
    participant SA as SessionAffinityStore
    participant R3 as Request 3 (v=3)

    R1->>SA: rememberWithVersion(key, acc0, t1, v=1)
    R2->>SA: rememberWithVersion(key, acc1, t2, v=2)
    Note over SA: writeVersion = 2
    SA-->>R2: onResponseId(resp_second) → updateLastResponseId(v=2) ✓
    SA-->>R1: onResponseId(resp_first) → updateLastResponseId(v=1) ✗ stale, rejected
    Note over SA: stored: acc1, resp_second, writeVersion=2 ✅
    Note over R1: non-streaming fallback: remember(key, acc0)
    R1->>SA: rememberWithVersion(key, acc0, now, writeVersion=Date.now()≈1.7e12)
    Note over SA: writeVersion = 1_700_000_000_000 ⚠️
    R3->>SA: rememberWithVersion(key, acc1, t3, v=3)
    Note over SA: 1.7e12 > 3 → REJECTED ❌ affinity frozen
Loading

Comments Outside Diff (1)

  1. lib/session-affinity.ts, line 127-154 (link)

    P1 version-space contamination: integer versions permanently lose to timestamp fallback writes

    writeVersion is used as two incompatible things:

    • an integer counter (sessionAffinityVersion = 1, 2, 3 …) assigned in the onResponseId callback
    • a unix-millisecond timestamp (Date.now() ≈ 1_700_000_000_000) assigned by the remember() fallback, which calls rememberWithVersion(key, idx, now, now)

    the stale guard entry.writeVersion > writeVersion (lines 143-145 and 83-85) works correctly when both sides use the same version space — but once a timestamp-versioned entry is stored, every future integer-versioned write is rejected because 1.7e12 >> 3.

    the contamination path exists today in index.ts:

    if (!responseContinuationEnabled || (!isStreaming && !storedResponseIdForSuccess)) {
        sessionAffinityStore?.remember(sessionAffinityKey, ...);  // writeVersion = Date.now()
    }

    this fires for non-streaming requests where responseContinuationEnabled=true but onResponseId never fires (storedResponseIdForSuccess stays false). after that one call, all subsequent rememberWithVersion(…, sessionAffinityVersion) writes for that session are silently rejected, freezing the preferred account and response-id indefinitely.

    fix — thread the per-request integer version into the fallback success path in index.ts:

    if (!responseContinuationEnabled || (!isStreaming && !storedResponseIdForSuccess)) {
        sessionAffinityStore?.rememberWithVersion(
            sessionAffinityKey,
            successAccountForResponse.index,
            Date.now(),
            sessionAffinityVersion,  // ← same integer version used in onResponseId
        );
    }
    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: lib/session-affinity.ts
    Line: 127-154
    
    Comment:
    **version-space contamination: integer versions permanently lose to timestamp fallback writes**
    
    `writeVersion` is used as two incompatible things:
    - an integer counter (`sessionAffinityVersion = 1, 2, 3 …`) assigned in the `onResponseId` callback
    - a unix-millisecond timestamp (`Date.now() ≈ 1_700_000_000_000`) assigned by the `remember()` fallback, which calls `rememberWithVersion(key, idx, now, now)`
    
    the stale guard `entry.writeVersion > writeVersion` (lines 143-145 and 83-85) works correctly when both sides use the same version space — but once a timestamp-versioned entry is stored, every future integer-versioned write is rejected because `1.7e12 >> 3`.
    
    the contamination path exists today in `index.ts`:
    ```typescript
    if (!responseContinuationEnabled || (!isStreaming && !storedResponseIdForSuccess)) {
        sessionAffinityStore?.remember(sessionAffinityKey, ...);  // writeVersion = Date.now()
    }
    ```
    this fires for **non-streaming requests where `responseContinuationEnabled=true` but `onResponseId` never fires** (`storedResponseIdForSuccess` stays `false`). after that one call, all subsequent `rememberWithVersion(…, sessionAffinityVersion)` writes for that session are silently rejected, freezing the preferred account and response-id indefinitely.
    
    fix — thread the per-request integer version into the fallback success path in `index.ts`:
    ```typescript
    if (!responseContinuationEnabled || (!isStreaming && !storedResponseIdForSuccess)) {
        sessionAffinityStore?.rememberWithVersion(
            sessionAffinityKey,
            successAccountForResponse.index,
            Date.now(),
            sessionAffinityVersion,  // ← same integer version used in onResponseId
        );
    }
    ```
    
    How can I resolve this? If you propose a fix, please make it concise.

    Fix in Codex

Fix All in Codex

Prompt To Fix All With AI
This is a comment left during a code review.
Path: lib/session-affinity.ts
Line: 127-154

Comment:
**version-space contamination: integer versions permanently lose to timestamp fallback writes**

`writeVersion` is used as two incompatible things:
- an integer counter (`sessionAffinityVersion = 1, 2, 3 …`) assigned in the `onResponseId` callback
- a unix-millisecond timestamp (`Date.now() ≈ 1_700_000_000_000`) assigned by the `remember()` fallback, which calls `rememberWithVersion(key, idx, now, now)`

the stale guard `entry.writeVersion > writeVersion` (lines 143-145 and 83-85) works correctly when both sides use the same version space — but once a timestamp-versioned entry is stored, every future integer-versioned write is rejected because `1.7e12 >> 3`.

the contamination path exists today in `index.ts`:
```typescript
if (!responseContinuationEnabled || (!isStreaming && !storedResponseIdForSuccess)) {
    sessionAffinityStore?.remember(sessionAffinityKey, ...);  // writeVersion = Date.now()
}
```
this fires for **non-streaming requests where `responseContinuationEnabled=true` but `onResponseId` never fires** (`storedResponseIdForSuccess` stays `false`). after that one call, all subsequent `rememberWithVersion(…, sessionAffinityVersion)` writes for that session are silently rejected, freezing the preferred account and response-id indefinitely.

fix — thread the per-request integer version into the fallback success path in `index.ts`:
```typescript
if (!responseContinuationEnabled || (!isStreaming && !storedResponseIdForSuccess)) {
    sessionAffinityStore?.rememberWithVersion(
        sessionAffinityKey,
        successAccountForResponse.index,
        Date.now(),
        sessionAffinityVersion,  // ← same integer version used in onResponseId
    );
}
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: index.ts
Line: 317-319

Comment:
**`sessionAffinityWriteVersion` declared outside the plugin closure — document the intent**

lines 317-319 have zero indentation while every other per-invocation `let` immediately above and below (310-316, 320) is indented one tab inside `OpenAIOAuthPlugin`. `sessionAffinityWriteVersion` therefore lives at module scope, consistent with the pre-existing `sessionAffinityStore`.

being at module scope is actually _correct_ here — a reset to `0` on each plugin reload would cause stale writes from pre-reload requests to overwrite post-reload ones. but the mismatched indentation makes the intent opaque and risks a future author accidentally pulling it inside the closure. consider adding a short comment:
```suggestion
// module-scope: shared across plugin reloads so the counter is strictly
// monotonic for the lifetime of the process — do not move inside the closure.
let sessionAffinityWriteVersion = 0;
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: test/session-affinity.test.ts
Line: 147-159

Comment:
**missing vitest coverage: timestamp-version contamination from `remember()` fallback**

the new regression test correctly covers integer-vs-integer stale rejection. there is no test for the mixed version-space scenario where `remember()` (timestamp version) fires for a session that already has or will receive integer-versioned writes.

suggested additional case:
```typescript
it('remember() timestamp version does not block subsequent rememberWithVersion() integer writes', () => {
    const store = new SessionAffinityStore({ ttlMs: 10_000, maxEntries: 4 });
    // integer-versioned write from onResponseId
    store.rememberWithVersion('session-a', 1, 1_000, 1);
    store.updateLastResponseId('session-a', 'resp_first', 2_000, 1);
    // simulate fallback remember() — note: in prod writeVersion = Date.now() ≈ 1.7e12
    // use a large value to reproduce the contamination
    store.rememberWithVersion('session-a', 1, 3_000, 1_700_000_000_000);
    // next integer-versioned write must NOT be rejected
    store.rememberWithVersion('session-a', 2, 4_000, 2);
    store.updateLastResponseId('session-a', 'resp_second', 5_000, 2);
    expect(store.getPreferredAccountIndex('session-a', 5_500)).toBe(2);
    expect(store.getLastResponseId('session-a', 5_500)).toBe('resp_second');
});
```
this test will fail with the current implementation, confirming the contamination bug.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "fix: prevent stale prompt cache session ..." | Re-trigger Greptile

Overlapping requests that share a prompt cache session can finish out of
order. Older responses were allowed to overwrite newer session-affinity
state, which lets stale previous_response_id and account affinity win.

Version session-affinity writes per request so the newest in-flight
request for a session keeps control of the stored response id and
preferred account.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 6, 2026

Warning

Rate limit exceeded

@ndycode has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 19 minutes and 51 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 19 minutes and 51 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: a67f8267-bb90-48d3-817a-c2db83ae29db

📥 Commits

Reviewing files that changed from the base of the PR and between 478f44c and 0f3f807.

📒 Files selected for processing (4)
  • index.ts
  • lib/session-affinity.ts
  • test/index.test.ts
  • test/session-affinity.test.ts
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/prompt-cache-races
✨ Simplify code
  • Create PR with simplified code
  • Commit simplified code in branch fix/prompt-cache-races

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

@ndycode
Copy link
Copy Markdown
Owner Author

ndycode commented Apr 6, 2026

Superseded by #387, which rebuilds the full open PR stack onto one reviewed integration branch.

@ndycode
Copy link
Copy Markdown
Owner Author

ndycode commented Apr 6, 2026

Closing in favor of #387.

@ndycode ndycode closed this Apr 6, 2026
@ndycode ndycode deleted the fix/prompt-cache-races branch April 12, 2026 06:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant